Extending the Cochran rule

Authors

  • Paul Rayson
  • Damon Berridge
  • Brian Francis
Abstract

We first describe a number of inter-related issues that the researcher needs to consider when comparing frequencies of linguistic features in two or more corpora. We then describe the chi-squared and log-likelihood tests used in previous research for the comparison of word frequencies. Our focus in this paper is on the reliability of the statistical tests, and we describe simulation experiments that compare the reliability of the chi-squared and log-likelihood statistics under different corpus sizes and different probabilities of a word occurring in text. We observe that the Cochran rule provides a good guide to the accuracy of both statistics in general, but in some cases it needs to be extended. We conclude by recommending higher cut-off values for the Cochran rule at the 5%, 1% and 0.1% levels. To extend the applicability of the frequency comparisons to expected values of 1 or more, use of the log-likelihood statistic is preferred over the chi-squared statistic, at the 0.01% level. The trade-off for corpus linguists is that the new critical value is 15.13.
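As a rough illustration of the two statistics named in the abstract, the following Python sketch computes the Pearson chi-squared and log-likelihood (G2) values for one word in two corpora, using the standard 2x2 contingency table (occurrences versus all other tokens). The function names, the choice of the full four-cell table, and the default cut-off of 5 for the smallest expected value are assumptions made for this example; they are not taken from the paper, which may define the statistics and the rule differently.

```python
import math


def expected_table(freq1, freq2, size1, size2):
    """Expected counts for the 2x2 table under the null hypothesis.

    freq1, freq2: occurrences of the word in corpus 1 and corpus 2.
    size1, size2: total token counts of corpus 1 and corpus 2.
    Returns (observed, expected) as flat lists of four cells.
    """
    observed = [freq1, freq2, size1 - freq1, size2 - freq2]
    total = size1 + size2
    row_word = freq1 + freq2       # all occurrences of the word
    row_other = total - row_word   # all other tokens
    expected = [
        row_word * size1 / total,
        row_word * size2 / total,
        row_other * size1 / total,
        row_other * size2 / total,
    ]
    return observed, expected


def chi_squared(freq1, freq2, size1, size2):
    """Pearson chi-squared: sum of (O - E)^2 / E over the four cells."""
    observed, expected = expected_table(freq1, freq2, size1, size2)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def log_likelihood(freq1, freq2, size1, size2):
    """G2 = 2 * sum of O * ln(O / E), treating 0 * ln(0) as 0."""
    observed, expected = expected_table(freq1, freq2, size1, size2)
    return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


def meets_min_expected(freq1, freq2, size1, size2, minimum=5.0):
    """Check that every expected cell is at least `minimum` (assumed cut-off)."""
    _, expected = expected_table(freq1, freq2, size1, size2)
    return min(expected) >= minimum


if __name__ == "__main__":
    # Hypothetical counts: 20 vs 5 occurrences in two 1000-token corpora.
    args = (20, 5, 1000, 1000)
    print("chi-squared:", chi_squared(*args))
    print("log-likelihood:", log_likelihood(*args))
    print("expected values large enough:", meets_min_expected(*args))
    # 15.13 is the critical value quoted in the abstract for the 0.01% level.
    print("significant at 0.01%:", log_likelihood(*args) > 15.13)
```

Passing a smaller `minimum` (for example 1.0) to `meets_min_expected` mirrors the abstract's idea of allowing lower expected values in exchange for a stricter critical value.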


Similar resources

Extending the Cochran rule for the comparison of word frequencies between corpora

We first describe a number of inter-related issues that need to be considered by the researcher when comparing frequencies of linguistic features in two or more corpora. We then describe the chi-squared and log-likelihood tests used in previous research for the comparison of word frequencies. Our focus, in this paper, is on the issue of reliability of the statistical tests, and we describe simu...


The Thurston Norm and Harvey

Cochran introduced Alexander polynomials over non–commutative Laurent polynomial rings. Their degrees were studied by Cochran, Harvey and Turaev as they give lower bounds on the Thurston norm. We first extend Cochran’s definition to twisted Alexander polynomials. We then show how Reidemeister torsion relates to these invariants and we give lower bounds on the Thurston norm in terms of the Reide...


Reidemeister Torsion, the Thurston Norm and Harvey’s Invariants

Recently representations over non–commutative rings were used by Cochran, Harvey, Friedl–Kim and Turaev to define Alexander polynomials whose degrees give lower bounds on the Thurston norm. We first show how Reidemeister torsion relates to these invariants. We give lower bounds on the Thurston norm in terms of the Reidemeister torsion which contain all the above lower bounds and give an elegant...


[math.GT] 31 Aug 2005: Reidemeister Torsion, the Thurston Norm and Harvey’s Invariants, Stefan

Recently twisted and higher order Alexander polynomials were used by Cochran, Harvey, Friedl–Kim and Turaev to give lower bounds on the Thurston norm. We first show how Reidemeister torsion relates to these Alexander polynomials. We then give lower bounds on the Thurston norm in terms of the Reidemeister torsion which contain and extend all the above lower bounds and give an elegant reformulati...


Urban Environment in the Light of Citizenship Rights

Our predecessors created cities of which all people were proud. We are also making history. Will posterity speak proudly of its ancestors? Iranian cities are gradually moving away from environmental requirements and from their fundamental role as places of tranquility and improved quality of life. Extending cities without rules, selling the law of the city, encroachment on forests and natural r...



Publication date: 2004